Chapter 1


INTRODUCTION 

Data base management systems (DBMS) are being sold to data
processing users in increasing numbers. Several reasons are being given
for this popularity:

a. DBMS are easier to use than conventional data handling
techniques.

b. DBMS solve file handling problems of complex data structures.

c. The DBMS are utilized by the user's competitors, suggesting
greater efficiency, lower cost, or other competitive factors.

Regardless of the reasons, and their validity, the fact remains
that DBMS are enjoying great popularity in today's ADP marketplace.
With that popularity comes an influx of packages which claim to provide
vast capabilities to support data bases.

The increasing number of choices in DBMS packages presents the
potential user with a problem of evaluating and selecting the system
which fits the specific needs of the installation (19:Preface). The
complexity of the problem is increased by the relative inexperience of
most users with DBMS, its properties, and its capabilities (39,1:9).

This paper is aimed at the potential user of a DBMS. A collection
of criteria which may be used to evaluate a DBMS is provided. Sufficiently
detailed description is included to permit a "cookbook" approach, selecting
those criteria which bear on the individual installation environment.


Chapter 2 provides a short background of the growth and direction
of DBMS. It was included to help establish the picture of the DBMS
environment and give the reader a feeling for the potential impact of
this capability.

Chapter 3 asks questions intended to determine whether the
installation of a DBMS is the practical solution to an installation's
problems. There are just as many practical reasons for not installing
a DBMS as there are reasons for. A DBMS is a very expensive tool, at
least initially. The long-term payoffs must be significant to justify
its use (16).

Chapters 4 through 10 contain the detailed evaluation questions.
With few exceptions each question is intended to stand by itself. When
multiple related questions are presented, they are generally grouped
together as a unit. When asking questions of this type of vendors,
it is recommended that they be separated so that they may be answered
individually.


Chapter 2


BACKGROUND


Data management systems began appearing in the early 1960's.
These early systems were known as File Management Systems (FMS). They
were so named because new approaches to processing the "files" of that
time were developed. The Formatted File System, known as the FFS, was
developed for the IBM 1401 computer and later updated to the IBM 1410.
This system provided the user with the ability to process single files
with a basic two-level tree structure.

These systems were subsequently converted to the IBM 360 and
today are known as the Modular Data System (MODS) and the MIPS-FFS
systems, respectively (16:259,13).


The FMS techniques implemented by many vendors followed the same
essential approach. They were, and still are, limited to concurrent
access of only one or two files. The organizational structure is usually
a two-level tree (hierarchical structure) (51).

Data management systems (DMS) expanded the capabilities of the FMS.
By increasing the number of files which could be concurrently accessed,
the DMS permitted more interaction between files within the installation.
Some DMS packages increased the levels of tree structure to three and
a few beyond that. IBM's Information Management System (IMS), as
implemented on the 360 computer, is a good example of a DMS (11).


Requirements for flexibility and consolidation of data files
pushed system designers to develop the next, and current, level of
implementation, the DBMS. The DBMS differs from the DMS in its scope of
support (hu,hS). The DMS is intended to permit concurrent access to
multiple files structured in a similar way. Each file is independent
of the others for some purposes and related for others. For example, a
personnel file is related to a payroll file when viewing the complete
information on an employee. Yet each may be processed independently of
the other for personnel and payroll purposes.

The DBMS seeks to integrate all facets of the installation's data
into one single data base (17:5-6). The individual identification of
separate files is lost as common data is used by all applications
(19:1-1). The payroll department may still have its own specialized data
which is not available to other users of the data base, but the common
information (i.e., name, address, telephone number) is shared by the
personnel and payroll groups (1,2,3).

The DBMS therefore becomes the facility to implement an integrated
data base covering all facets of the user's organization, eliminating
separate files and the extra effort required to maintain them
(36: 23 >6).

The data processing industry has long been noted for its confusing
terminology. Often different terms coined by individual vendors have the
same meaning. The DBMS area is no different. New meanings for old terms
have appeared along with a number of new terms. The glossary lists a
few of these terms along with some common synonyms and a brief explanation.

The Conference on Data Systems Languages (CODASYL) was organized
in the late 1950's to define a standard language for computer programming.
COBOL was the result. CODASYL has continued to work on new standards,
and the data base management area has received major emphasis.


In 1969 the Data Base Task Group (DBTG) of CODASYL released a
DBMS standards recommendation to the data processing industry. This
report has been upgraded several times, and work continues today on
improving the specification. The 1971 revision of the 1969 report
covers the Data Definition Language (DDL) and the COBOL-oriented Data
Manipulation Language (DML) (17). A further revision of the DDL was
published in 1973 by the Data Definition Language Committee (DDLC),
successor to the DBTG (18).

The most significant point to note about the CODASYL effort is
its acceptance. Hardware and software vendors alike are developing
new DBMS's using the CODASYL guidelines. Examples are Cullinane's IDMS,
DEC DBMS-10, Honeywell's IDS-2, and Univac's DMS 1100. Industry leaders
predict that, within a few years, enough major implementations of the
CODASYL approach will be available to make cross-hardware transition of
DBMS data bases relatively easy (11,15,16,17).

The IBM user organizations SHARE and GUIDE developed a joint
recommendation for data base development which was published in November
1970 (U7). This document described the desired features to be found in
the ideal DBMS. Unfortunately, and in contrast to the CODASYL
specification, it did not provide the guidelines required to develop the
DBMS. As a result, the SHARE/GUIDE document has received less attention
than it deserves.


Chapter  3 


DO YOU REALLY NEED A DBMS?


Numerous articles have been written expounding the merits of the
DBMS (9,32,37). Others cite reasons for procurement of a DBMS package
(‘ Oil). Very few writers have stood "against the tide" to question the
reasons for entering this new arena of ADP processing (39,U6, L8,U?).

There are many good reasons for adding a DBMS to the capabilities
of an installation. Several will be discussed later in this chapter.
However, the potential user must keep in mind that the reasons may not
apply to every environment and that the evaluation study may
conclude that installation of a DBMS is not the proper solution (16).
Each evaluation should be conducted with this possibility as one of the
stated alternatives. The pre-conceived notion that a DBMS will be
installed, with evaluation being used to determine which one, may result
in a very costly error.

As ADP use within organizations has grown, the complexity of
the applications has grown too (2). Many "independent" applications
have evolved to satisfy the needs of individual users or departments.
Where this has occurred, each user has provided their own data and
received reports from their own exclusive files.

This unregulated and independent growth of applications has
resulted in many related but disconnected files. As corporate need for
data grows, the need to relate previously separate files becomes more
urgent. Unfortunately, many files are structured in such a way that
ready association is impractical. When attempts to modify the files
to improve relationships are made, the complexity may overwhelm the
programming staff (12,13).

The DBMS is often advertised as the solution to this type of
problem. Properly implemented, it may well solve the need to relate
complex data. The capabilities of the various DBMS packages vary
widely in this area (16,19:2-13). The extent and complexity of the
files to be related, present and future, will bear extensively on which
DBMS, if any, is selected (39,U2,U9).

The independent development of files, as described above, results
in redundant data stored in many files. Costs associated with the entry,
maintenance, and reference to redundant data may be significant
(19:1-1).

The integrity of the data base is compromised when redundant data,
stored in separate files, is not consistently maintained (1,5,1?). When
two departments each enter the same item into two files, the probability
of error increases. The timeliness of entry is seldom consistent. One
department enters the data before another, and the overall data base is
no longer "in sync." If the item being updated is used as a key to relate
files together, the delay will result in wrong answers if a query is
entered before the second user updates his portion of the data base.
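
The timing problem described above can be illustrated with a minimal
sketch. The file layouts, key values, and function names below are
hypothetical and are not drawn from any particular package; they only
show how a query entered between two independent updates fails to relate
the files.

```python
# Hypothetical illustration: two departments each keep their own copy of
# an employee number, which is also the key used to relate the two files.
personnel_file = {"E100": {"name": "J. Smith", "dept": "Accounting"}}
payroll_file = {"E100": {"salary": 12000}}


def employee_view(emp_key):
    """Relate the two files on the shared key; None if either side is missing."""
    person = personnel_file.get(emp_key)
    pay = payroll_file.get(emp_key)
    if person is None or pay is None:
        return None
    return {**person, **pay}


def rekey(records, old_key, new_key):
    """Change the key of one record in one file only."""
    records[new_key] = records.pop(old_key)


# Personnel changes the key first; payroll has not caught up yet.
rekey(personnel_file, "E100", "E200")
before_second_update = employee_view("E200")  # query falls in the gap

# Payroll applies the same change later, and the files agree again.
rekey(payroll_file, "E100", "E200")
after_second_update = employee_view("E200")
```

With a single point of entry for the key, the gap between the two
`rekey` calls would not exist.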

Many of today's complaints about bad computer outputs can be
traced to this problem (39,1x1). Only by limiting the entry of data to a
single path into a single logical location in the data base can the
integrity of the data be assured (a, 5). The DBMS is capable of
supporting this requirement. The packages currently available vary widely
in their ability to reduce or eliminate data redundancy (27,25).


The volume of data to be stored in the data base affects the
performance of most DBMS's. Depending on the method employed to relate
data, system efficiency may degrade seriously when the data base
approaches one million bytes in size (38).

Inverted file systems typically update slowly and retrieve rapidly.
The installation which processes a high volume of updates (10 per cent
or more of the data base size per month) will find most inverted file
systems excessively costly to operate (15, 22, no, m). This problem may
force the user of a high-volume, large-size data base to either eliminate
use of inverted systems or devise an independent updating method which
will reduce update overhead (19:1-110.

Chain pointer systems vary widely in their processing efficiency.
As a general rule, the longer the data chain the slower the system
operates. Very large data bases, therefore, process much slower than
small ones (U6). Careful design of the data base structure can overcome
some of this inefficiency. However, to adequately "tune" the structure
it is necessary to know ahead of time the anticipated loading volume of
each record type (31, 32,16).

The complexity of relating data within an organization was the
downfall of the "total system" concept of the mid-1960's. The programming
skills and resources to implement such a system were lacking in all but
the largest installations. The inability of the industry to support
"total systems" created much distrust and reluctance to embark on another
such venture (17).

Today's Management Information System (MIS) developments are a
subset of the "total system." The original plans have been scaled down
to a level which can be developed without the need of specialized skills
and hardware. The DBMS, with its improved data handling capabilities, may
again open the door to the "total system." A few systems, principally
those which adhere to the CODASYL specifications, have the ability to
relate data in the manner it occurs in the "real world" (17, up, IS).

The DBMS's relational ability is one of its most important assets.
The structure of most businesses is hierarchical in nature. The data
handled by the departments or segments of a business crosses over the
boundaries of the organizational hierarchy. Efficient and cost-effective
use of data demands that the DBMS be able to effect such a cross-over
with a minimum of step-retracing and redundant data. Therefore a true
networking capability is required for a DBMS to effectively support an
application area as broad as the complete business (-1,13). If such an
implementation is not available on the hardware used by an organization,
it may be better to defer procurement until one is available.

On-line is a term which is commonly used in ADP today. Many
organizations are developing direct input to their data bases and
bypassing traditional keypunching of data (20). The timeliness and
accuracy of data are generally enhanced, and costs of data preparation
are reduced. On-line data entry implies that on-line retrieval is also
desired. The complexities of developing on-line data entry and its
improved timeliness are seldom justified on the basis of costs and
accuracy alone. Management desires faster response to its inquiries to
meet the increased demands of the marketplace (23).

On-line retrieval requires that those areas of the data base which
may be requested from a terminal be available immediately. When the data
is stored in several data bases, this requirement becomes difficult to
meet. An integrated DBMS, using fully networked data structures, can
permit access to any portion of the data base (2?). The entire
organizational data structure is processable from any terminal through
one set of software (Uii). This feature dramatically reduces development
costs and operating time.

Again, much care must be taken to determine if the candidate
DBMS implementations can support the scope of the organization's data
needs. If not, procurement of a "next best" substitute will be a waste
of time and money.

No organization should purchase a DBMS just because its
competitors have bought one. The status symbol of owning a DBMS may
actually improve the competitor's position if their operation is suited
to DBMS usage and yours is not. Competitive pressure applied by proper
application of a DBMS justifies purchase only when the organization has
determined that a DBMS will reduce the pressure by improved efficiency
of operation (13,11).


Chapter 4


TYPES OF DBMS AVAILABLE

One of the most confusing aspects of DBMS evaluation for the
unfamiliar is the different types of DBMS implementations available.
Each vendor claims his is the best, most cost effective, and flexible.
For the purpose for which it was designed, most DBMS packages perform
very well.

Data is structured by the packages using either data networking
or a subset. Networking of data should not be confused with the
networking of communications equipment supporting an organization. The
communications network describes the relationship between points where
users of an ADP system enter and receive data. The data network describes
the relationship assumed between elements of data, records, and files
within a data base (1*0).

Data networks provide the most flexible and useful form of data
relationships. Only through the network approach can data be freely
related so that this data represents the actual relationships it occupies
in "real world" conditions. DBMS implementations which permit unrestricted
networks of data require more effort to design the data base structure
but are easier to program and use (U2,Ui,I5).

The hierarchical or "tree structure" method of data relationship
is the most often used subset of networks. A majority of early
implementations utilized this technique (16). The hierarchical technique
has the advantage of being easy to implement. It has the disadvantage of
restricted relationship to records other than those directly above or
below in the tree. This restriction makes access to related records in
other trees or branches of the tree slow and difficult. Some packages
have developed coupling techniques which use auxiliary records or tables
to locate related records. While this accomplishes the end result, it
is inefficient in data storage space and processing time (19).
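
A small sketch may help show why related records in separate branches
are costly to reach in a pure hierarchy. The record names and the
"hop" count used as a cost measure are assumptions made for
illustration only; actual packages differ in how they traverse and
store these links.

```python
# Hypothetical two-branch hierarchy: each record knows only its parent.
parent = {
    "company": None,
    "personnel": "company",
    "payroll": "company",
    "employee_record": "personnel",
    "pay_record": "payroll",
}


def path_to_root(record):
    """List the record and each ancestor up to the top of the tree."""
    path = [record]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_hops(a, b):
    """Links followed going up from a to the nearest common ancestor, then down to b."""
    up = path_to_root(a)
    down = path_to_root(b)
    common = next(r for r in up if r in down)
    return up.index(common) + down.index(common)


# A network structure may relate the two records directly.
direct_links = {("employee_record", "pay_record")}


def network_hops(a, b):
    """One hop when a direct relationship exists, otherwise the tree route."""
    if (a, b) in direct_links or (b, a) in direct_links:
        return 1
    return tree_hops(a, b)
```

In this sketch the employee record and the pay record are four links
apart through the tree but one link apart when the network
relationship is declared.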

Each DBMS package utilizes some form of networking or hierarchical
data structure. The method used to link the individual data records
together within the network or hierarchy also differs considerably.

One of the most often heard but least understood methods of
connecting related data together is the "inverted list." This technique
stores data records in the data base (again by several different methods
depending on vendor) and associates the data through lists of pointers
stored elsewhere. It is possible to establish many lists to point to
certain data values in records containing those values. In some cases
the data itself is removed from the record and referenced to the list.
This feature reduces the volume of data stored in the data base but
increases the overhead of processing data (15,22,111).

The most important feature of the inverted list technique is its
random, or ad hoc, retrieval speed. The use of many lists pointing
through the data base makes the ad hoc query very fast. The same is not
true of sequential retrieval. Sequentially produced reports are often
slow. Update efficiency degrades on an accelerating curve as the volume
of data stored in the data base increases.
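
The retrieval and update behavior of the inverted list can be sketched
as follows. This is a simplified illustration under assumed record
layouts and function names, not a description of any vendor's
implementation; it shows only that a query consults a list rather than
scanning records, and that an update must maintain the lists as well as
the record.

```python
from collections import defaultdict

records = [
    {"name": "Adams", "dept": "Payroll"},
    {"name": "Baker", "dept": "Personnel"},
    {"name": "Clark", "dept": "Payroll"},
]

# One inverted list per indexed field: value -> positions of matching records.
inverted = {"dept": defaultdict(list), "name": defaultdict(list)}
for position, record in enumerate(records):
    for field, index in inverted.items():
        index[record[field]].append(position)


def find(field, value):
    """Ad hoc retrieval: consult the list, then fetch only the matching records."""
    return [records[p] for p in inverted[field].get(value, [])]


def update(position, field, new_value):
    """An update must fix the record and every list entry for the old value."""
    old_value = records[position][field]
    records[position][field] = new_value
    list_writes = 0
    if field in inverted:
        inverted[field][old_value].remove(position)
        inverted[field][new_value].append(position)
        list_writes = 2
    return 1 + list_writes  # writes performed: the record plus two list entries
```

Each additional inverted field adds list maintenance to every update
of that field, which is one way to picture the update overhead noted
above.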

Chaining is the other most common method of data linking in use
today. This method requires the placing of a physical pointer in the
data base record to identify the next record in the string of data. Often
reverse pointers are used also. The pointers use data storage space
and represent overhead. Where complex data networks are present, it is
possible for pointers to occupy more space than the data.
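
The chaining method and its storage overhead can be sketched as below.
The four-byte pointer size and the record sizes are assumptions chosen
for illustration; real values depend on the package and the hardware.

```python
# Hypothetical pointer size in bytes; real values depend on the package.
POINTER_BYTES = 4


class ChainedRecord:
    def __init__(self, data):
        self.data = data
        self.next = None   # forward pointer to the next record in the chain
        self.prior = None  # optional reverse pointer


def build_chain(values):
    """Link records in order and return the head of the chain."""
    head = previous = None
    for value in values:
        record = ChainedRecord(value)
        if previous is None:
            head = record
        else:
            previous.next = record
            record.prior = previous
        previous = record
    return head


def walk(head):
    """Follow forward pointers; the work grows with the length of the chain."""
    record, found = head, []
    while record is not None:
        found.append(record.data)
        record = record.next
    return found


def pointer_overhead(data_bytes, chains_per_record, two_way=True):
    """Fraction of each stored record taken up by pointers."""
    pointers = chains_per_record * (2 if two_way else 1) * POINTER_BYTES
    return pointers / (pointers + data_bytes)
```

Under these assumptions, a 20-byte record that participates in three
two-way chains carries 24 bytes of pointers, so more than half of the
stored record is overhead.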

Chain pointers are implemented in a variety of ways. Some systems
rely completely on the pointers to relate data records. Others require
that subsidiary records carry the physical key value which relates them
to the owner. This approach is extremely wasteful of data storage space
and creates an update bottleneck. The key field of the owner cannot be
modified without modifying all subsidiary member records also. As a
result, those packages which utilize this method of data relationship
normally do not permit updating key fields (1?).
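
The update bottleneck can be seen in a short sketch. The record
layouts and the function name are hypothetical; the point is only that
a change to the owner's key must be repeated in every member that
carries a copy of it.

```python
# Members that carry the owner's key value instead of relying on pointers alone.
owner = {"key": "D10", "name": "Payroll"}
members = [{"owner_key": "D10", "name": n} for n in ("Adams", "Baker", "Clark")]


def change_owner_key(owner, members, new_key):
    """Changing the owner's key forces a rewrite of every member record."""
    old_key = owner["key"]
    owner["key"] = new_key
    rewritten = 0
    for member in members:
        if member["owner_key"] == old_key:
            member["owner_key"] = new_key
            rewritten += 1
    return 1 + rewritten  # records written: the owner plus each member


records_written = change_owner_key(owner, members, "D20")
```

Here one logical change produces four physical writes, and the count
grows with the number of members.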

Logical pointer arrays are related to both chain pointers and to
inverted lists. The member records are related to the owner occurrence
through a secondary list. This list contains the data base addresses of
all of the member records. Logical pointers eliminate the need for chain
pointers imbedded in the physical records, saving data storage space.
The pointer list is available in the same manner as an inverted list,
improving the retrieval qualities of the data base (33:216, 36:136-6).
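
A logical pointer array can be sketched as a list of member addresses
held apart from the records themselves. The addresses, record contents,
and function names below are invented for illustration and do not
reflect any specific package.

```python
# Simulated data base storage: address -> record contents (no embedded pointers).
storage = {
    100: {"type": "department", "name": "Payroll"},
    210: {"type": "employee", "name": "Adams"},
    340: {"type": "employee", "name": "Clark"},
}

# Secondary list kept outside the records: owner address -> member addresses.
pointer_array = {100: [210, 340]}


def members_of(owner_address):
    """Fetch member records through the owner's pointer list."""
    return [storage[a] for a in pointer_array.get(owner_address, [])]


def add_member(owner_address, address, record):
    """Store the record, then note its address in the owner's list."""
    storage[address] = record
    pointer_array.setdefault(owner_address, []).append(address)
```

The member records carry no pointers of their own, but the list itself
still occupies storage and must be maintained on every addition.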

It must be emphasized that all DBMS implementations have some
overhead associated with relating occurrences of data together. The
overhead may occur within the data base storage area itself, or it may
occur in auxiliary storage areas. The "bottom line" of determining DBMS
overhead is the total storage space required for all data sets necessary
to operate the DBMS. Don't be fooled by vendor claims that one package is
more efficient because no pointers are present in the main data base area.
The overhead is still there. It is simply distributed differently
( 33 : 136.
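
The "bottom line" comparison can be worked as simple arithmetic. The
two packages and all byte counts below are hypothetical figures chosen
only to show that overhead moved to auxiliary data sets still counts
toward the total.

```python
def overhead_fraction(data_sets):
    """data_sets maps a data set name to (user_data_bytes, overhead_bytes)."""
    user = sum(u for u, _ in data_sets.values())
    overhead = sum(o for _, o in data_sets.values())
    return overhead / (user + overhead)


# Package A embeds its pointers in the main data base area.
package_a = {"main data base": (800_000, 200_000)}

# Package B keeps the main area free of pointers but needs auxiliary data sets.
package_b = {
    "main data base": (800_000, 0),
    "pointer lists": (0, 150_000),
    "directory": (0, 50_000),
}
```

Both hypothetical packages devote the same share of total storage to
overhead, even though Package B shows none in its main data base area.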


Chapter  5 


EVALUATION OF DBMS VENDORS

Selection of a DBMS requires a careful investigation into the
qualifications, background, and capabilities of the vendors of DBMS
packages. Unless the user has the capacity to assume support of the
software package at some future point, the stability of the vendor
should be of prime importance.

Structural  Background 

When  was  the  vendor's  company  organized? 

Is  the  vendor  incorporated? 

Is  the  stock  publicly  held? 

Is  the  stock  listed  on  any  stock  exchange? 

What  is  the  vendor's  Dun  & Bradstreet  rating? 

The first group of questions is intended to identify the
structural background of the vendor. It is important to know how long
the vendor has been in business as an indicator of its stability and
ability to survive over the long term.

An incorporated vendor is less apt to dissolve as a result of the
death or departure of a principal than is a proprietorship or partnership.
When the vendor's stock is publicly held and listed on a stock exchange,
a further indication of possible stability is provided.

It may be wise to investigate the dividend policy in these cases
to be sure that the company maintains a conservative financial profile.
A high dividend rate may indicate that inadequate research and development
of the DBMS line is being done. Finally, the Dun & Bradstreet rating
of the vendor is a good indicator of the financial stability of the
company (39,11;).


Location 

Where is the vendor's main office located?

Does  the  vendor  maintain  other  offices  within  the  United  States? 
What  is  the  location  of  the  closest  office? 

Does  the  vendor  provide  technical  support  personnel  at  all 
offices? 

Is 24-hour technical support available at all offices? If not,
is 24-hour technical support available?

The location of support facilities for the DBMS is very important.
The ability of the vendor to train user personnel and to provide technical
assistance is affected by his location. The distribution of sales and
technical support bears heavily on the speed and cost of support. The
make-up and qualifications of local office personnel are important when
determining what level of support may be expected. The installation
which operates around the clock must have 24-hour technical support
available (he, 18, no.


Personnel 

How many personnel worked for the vendor on January 1, 1975?
On January 1, 1976?

How many of the vendor's personnel were dedicated to DBMS support
on those dates?

What percentage of the DBMS support personnel is devoted to
technical support?

People are the greatest asset of the software vendor. The ability
of the vendor to produce, market, and support a quality DBMS product is
dependent upon the make-up of its personnel. It is useful to know the
number of people working for the vendor on the same date over a two- or
three-year period. This reflects the intent of the vendor to support a
growing number of customers. The number of personnel actually supporting
the DBMS on the comparison dates provides an indication of the importance
of the DBMS product within the vendor's company (L2,I1,I2).


Diversification 

Does  the  vendor  market  any  other  software  product? 

What  percentage  of  the  vendor's  revenue  comes  from  the  DBMS? 

How many installations of the DBMS were installed on January 1,
1975? On January 1, 1976?

Were any installations discontinued during the period?

How many different industries are using the DBMS?

Diversification increases the stability of a company. The software
business is no exception. The vendor who relies entirely on a DBMS
and its related packages for its economic survival is forced to adopt
policies less stable than one who has other products to absorb some of
the costs of operation. If other products represent less than 50 per
cent of the vendor's income, a marginally successful DBMS product could
affect corporate stability to the point where the DBMS product would be
dropped.

The growth of the DBMS over a year's period provides some
indication of the market penetration and popularity of the package. It may
also be an indicator of the vendor's marketing capabilities. This marketing
factor will often show up in a large percentage of non-technical personnel.
The total number of installations is meaningful only if the number of
installations which dropped the DBMS is also known. A high turnover in
users often indicates a lack of continued customer satisfaction.
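
The relationship between installation counts and discontinued
installations can be worked as simple arithmetic. The function name and
the figures in the example are hypothetical; they illustrate how the same
net growth can hide very different levels of turnover.

```python
def installation_summary(start_count, end_count, discontinued):
    """Summarize one year of installations from answers to the questions above.

    start_count and end_count are the installations on the two comparison
    dates; discontinued is the number dropped during the period.
    """
    if start_count <= 0:
        raise ValueError("start_count must be positive")
    new_installations = end_count - start_count + discontinued
    return {
        "net_growth": (end_count - start_count) / start_count,
        "turnover": discontinued / start_count,
        "new_installations": new_installations,
    }


# Two hypothetical vendors, each growing from 40 to 50 installations.
steady_vendor = installation_summary(40, 50, discontinued=0)
churning_vendor = installation_summary(40, 50, discontinued=10)
```

Both vendors in the example report 25 per cent net growth, but the
second lost a quarter of its original users during the year.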

Finally, a users list provides an indication of the ability of
the DBMS to support a broad application area. A users list oriented only
to a single or to a few industries warns of a package which has limited
application (15).

The physical environment into which the DBMS will enter must be
considered as part of the evaluation process. This environment includes
the capabilities of the user to support the DBMS and the amount of
supportive programming required to keep DBMS operations functioning
smoothly.


Hardware Requirements


Is  the  DBMS  implemented  on  more  than  one  computer  manufacturer's 
equipment?  If  so,  which  ones? 

Is implementation of the DBMS consistent across all hardware
lines?

Does installation of the DBMS require procurement of special
devices, materials, or equipment?

What amount of main memory is required to effectively utilize
all advertised features of the DBMS?

Can the DBMS be operated in less main memory space?

If so: What is the minimum main memory required to operate
the DBMS? What degradations in operating effectiveness and
feature support occur with the reduced memory usage?

How many disk units (spindles, modules) are required to assure
efficient operation of the DBMS?

Must all disk modules be on line during operation of the
DBMS?

Each DBMS package has different hardware requirements. Some are
very thrifty in their use of core storage and disk space. Others require
large amounts of both to operate effectively. The options open to the user
increase when the DBMS is implemented on more than one line of computers.
The user may select different hardware for varied uses and still operate
the same DBMS. If this approach will be seriously considered, it is
important to determine if all versions of the DBMS on different equipment
are at the same level of implementation.

Several DBMS packages require that the user purchase supplementary
equipment and/or software to assure effective operation. This condition
places added burdens on the user in terms of cost and maintenance of the
supplementary items.


DBMS vendors regularly understate the amount of main memory
necessary to operate their package effectively. In each instance the
package will operate with reduced main memory but usually with a
coincident reduction in feature capability or operating efficiency. The
disparity occurs when vendors advertise the minimum memory required to
operate the DBMS without explaining the reduced performance caused by
extensive use of overlays. Performance increases as the use of main
memory increases, reducing the number of overlays employed.

The number of physical disk units required to operate the DBMS
at its advertised speed is often larger than what is apparent in the
vendor's literature. Using fewer than the optimum number of units may
seriously degrade performance. (An excellent example is found in the case
of MRI's System 2000: Six data sets are required for each data base. If
placed on one disk unit instead of six, a 40- to 50-percent throughput
degradation is encountered (HI).)

Flexibility  of  operation  improves  if  the  user  has  the  ability  to 
switch  disk  units  or  commit  only  a portion  of  the  disk  space  to  the  DBMS 
for  a given  execution. 


Auxiliary  Support  Programs  (Utilities) 

Does the DBMS package include utilities to save and restore the
data base?


Do  save  and  restore  utilities  include  loading  density  statistics 
on  the  areas  being  processed? 

Can  areas  of  the  data  base  be  selectively  saved  and  restored? 

Are  backup  and  recovery  utilities  included  as  part  of  the  vendor's 
package? 

Can  data  base  recovery  utilities  be  operated  from  the  operator's 
console? 

Does the vendor supply utilities necessary to repair the data
base in event of partial destruction?

Can  utilities  be  used  by  operators  and  non-programming  personnel 
with  a high  probability  of  accuracy? 

Can  DBMS  utilities  be  executed  while  the  system  is  running? 

How much in-house programmer effort is required to write utility
programs to support the DBMS?

All DBMS packages require some auxiliary support. None of the
packages on the market today does everything internally. The number,
versatility, and usability of supporting utility packages can either
dramatically enhance the usefulness of the DBMS or detract seriously
from its ability to support the user. Small installations must be
particularly aware of the depth of utility support. Those packages which do
not supply adequate utilities force their users to dedicate expensive
senior-level programming support to the DBMS. The presence of extensive
utilities does not in itself indicate a low level of support costs. The ease
with which operators and non-programmers grasp and use the utility programs is
important.

Data  base  save,  restore,  backup,  and  recovery  programs  must  be 
flexible  and  easily  used.  The  time  when  these  programs  are  needed  is 
seldom  the  most  advantageous.  Speed,  especially  in  an  on-line  environment, 
may  be  essential  to  keep  vital  operations  running. 

Only a few DBMS utilities can operate on the data base while other
DBMS functions are processing. This feature, again in an on-line operation,
may be very important to assure proper security of the data bases
without interruption of normal processing.


Chapter  ? 


EVALUATION  OF  DBMS  DATA  BASE  STRUCTURES 


While the number of access methods supported by hardware vendors
to handle data storage is limited, the methods which can be used to store
data using these access methods are almost limitless. A majority of the
DBMS's implemented on the IBM 360/370 utilize the Basic Direct Access
Method (BDAM) as the storage vehicle, but the similarity ends there.
Different intentions for the use of data dictate different storage
methods. Evaluation of any DBMS must include the comparison of these
storage methods to determine which one, if any, processes data in the
manner desired by the prospective user.

Program  Independence 

Can  the  structure  of  the  data  base  be  modified  without  requiring 
the  recompilation  of  programs  using  the  structured  area? 

Can  the  structure  of  the  data  base  be  upgraded,  adding  new 
applications,  without  affecting  the  operation  of  existing 
programs? 

Can data elements be added to or deleted from a record used by
an application program without forcing recompilation of the
program?

Can  the  length  of  a data  element  be  modified  without  forcing  the 
recompilation  of  a program  using  the  modified  data  element? 

Program independence is one of the most important advantages of a
properly-designed DBMS. Program independence permits the structure of
the data base to be modified and enhanced with minimum impact on all
application programs using the data base (19:3-5,36:35-36). No impact at all
should be evident on programs whose records are not modified. Recompiling
programs to adjust for data base structure changes becomes more and more
impractical as the scope of the data base increases and the number of
programs grows.

A few data base systems permit adding new data elements to an
existing record definition without recompiling the programs using the
record. Of course if the program must use the new data element,
recompilation is often necessary.

Modification of data element size can have significant impact on
using programs. Few DBMS implementations can keep such a change
transparent to the using program's logic (1:5,11).
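
One way such independence can be obtained is for the program to request data
elements by name while the DBMS alone holds the record description. The sketch
below is only an illustration under that assumption; the record layouts, the
element names, and the helper functions are hypothetical and do not describe
any particular package.

```python
import struct

# Record descriptions held by the DBMS, not by the program (hypothetical).
SCHEMA_V1 = [("part_no", "6s"), ("qty", "i")]
SCHEMA_V2 = [("part_no", "6s"), ("qty", "i"), ("bin", "4s")]   # element added

def store(schema, values):
    """Pack a dict of element values into the stored record format."""
    fmt = "<" + "".join(code for _, code in schema)
    return struct.pack(fmt, *(values[name] for name, _ in schema))

def fetch(schema, stored, wanted):
    """Return only the elements the program asked for, located by name."""
    fmt = "<" + "".join(code for _, code in schema)
    names = [name for name, _ in schema]
    fields = dict(zip(names, struct.unpack(fmt, stored)))
    return {name: fields[name] for name in wanted}

def reorder_report(schema, stored):
    """An 'application program' that knows only the names part_no and qty."""
    rec = fetch(schema, stored, ["part_no", "qty"])
    return rec["part_no"].decode().strip(), rec["qty"]

old = store(SCHEMA_V1, {"part_no": b"A-1001", "qty": 12})
new = store(SCHEMA_V2, {"part_no": b"A-1001", "qty": 12, "bin": b"R7  "})

print(reorder_report(SCHEMA_V1, old))
print(reorder_report(SCHEMA_V2, new))
```

The program text of reorder_report is identical in both cases; only the
description supplied by the DBMS changes. A program compiled against fixed
byte offsets would not enjoy the same protection.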


Hardware  Independence 




Is  the  logical  structure  of  the  data  base  independent  of  the 
storage  device  used? 

Does the DBMS support storage of the data base on both disk and
tape devices?

Is  the  physical  device  type  transparent  to  the  application  program 
using  the  data  base? 

Can  the  size  of  data  base  storage  areas  be  modified  without 
altering  the  structure  of  the  data  base  or  the  programs 
which  use  it? 

Can the size or device location of the data base be altered without
forcing the unloading and reloading of the data base?

Can  the  programs  using  the  data  base  be  restricted  to  certain 
physical  units  of  the  data  base  without  affecting  the  logic 
of  the  application  program? 


The ability to define data base structures and populate them
without regard for the type of storage media is a definite advantage. All
DBMS implementations support disk processing. The direct access approach
is consistently the mainstay of DBMS storage philosophy. Disk, however,
is not the only storage device which supports direct access. Magnetic
drums and staged devices, such as the new IBM 3850 mass storage unit, are
all serviced by direct access techniques.

Few data base systems support magnetic tape processing. This
unfortunate circumstance forces new DBMS users to alter their processing
medium at the same time they alter their file philosophy. There are
still numerous applications where disk-residence of low-usage data is
not cost-effective (OU,l5).

Change being the by-word of ADP, the ability to alter the size
of data base storage areas or the devices upon which they reside is a
very important feature of the generalized DBMS (n).

Security and flexibility of data base structuring is further
enhanced by the ability to restrict certain data base areas to physical
storage units. This assignment restriction must be transparent to the
programs using the data. This feature is particularly useful where data
is sensitive and needs to be physically removed and placed in a secure
area without affecting the operation of the system as a whole (36:1^-55).


Data  Sets 


What is the minimum number of data sets which must be established
to operate the DBMS?

What is the maximum number of data sets which are supported by
the DBMS?

How many data sets are required for effective use of each logical
data base?

Must data sets be assigned in any order, in any sequence, or to
specific devices to achieve optimum performance?

Can multiple data sets be defined for one logical data base?

Can multiple logical data bases be defined within one data set?

What is the principal access method employed to access data sets?

Does the DBMS support multiple access methods?


The number of data sets required by a DBMS is not a critical
selection criterion in a large installation. In a small installation with
a limited number of disk units, system performance may be degraded
seriously if sufficient disk units are not available (13).


The DBMS must permit a variable number of data sets to be assigned.
This number can be expected to grow continuously over a period of time as
more and more applications are integrated into the data base structure.

Flexibility in the assignment of data sets enhances the usefulness
of the DBMS. If an application using sensitive data requires
several disk packs because of optimum data set distribution, the
willingness of the user to implement the DBMS is often reduced.

Most DBMS implementations utilize the BDAM as their means of
communicating with disk storage. A few have developed their own methods.
Both approaches have merit and hazards. Using BDAM recognizes its
stability and ease of utilization. It also subjects the DBMS to modifications
which may be implemented by the vendor to the BDAM software.
Independent access methods may be more efficient but increase the
complexity and maintenance costs of the DBMS. Many different access
methods are available today (31:150-51,33:195-96).

A very  useful  feature,  supported  by  few  DBMS  implementations,  is 
multiple  access-method  use.  If  this  feature  is  available,  the  user  may 
define  the  method  which  processes  his  data  best. 


Data  Structures 

Is  the  data  base  structure  adequate  to  project  data  in  the  way 
it  exists  in  the  real  world,  or  must  data  structures  be 
adjusted  to  meet  the  restrictions  of  the  DBMS? 

Does  the  DBMS  limit  the  size,  quantity,  mode,  and  organization 
of  data  elements  in  any  way?  If  so,  to  what  extent? 

Does the DBMS support variable-length records?

What are the minimum and maximum lengths permitted for data
elements and records within the data base?

Are  structural  relationships  within  the  data  base  dependent  on 
actual  data  values  repeated  between  related  records? 

Does  the  system  permit  modifications  of  data  elements  which  are 
used  as  key  fields  in  relating  and  ordering  data  elements? 

Does  the  DBMS  structure  approach  permit  effective  reduction  of 
redundant  data  elements  between  record  types? 

What form of structure is used to relate associated data records
together (i.e., chain, inverted list, pointer arrays)?

Can indexing be provided both forward (next record) and backward
(prior record)?

Does the DBMS permit compacting of data record occurrences to
reduce storage usage?

If compacting is permitted, what is the processing overhead
associated with the compacting process?


The relationship between data elements has grown more complex as
the structure of business expands. A major problem in keeping ADP
applications current with the needs of organizations has been the intricacies
of data relationships. It is very important that a DBMS be able to
interrelate data elements as they actually exist in the real world. When
it is necessary to adjust or "bend" data structures to meet the restrictions
of the DBMS, a great deal of flexibility is lost. Further, the
increase in program complexity is measured in orders of magnitude ( ic , 17).

Most  computers  provide  the  ability  to  represent  data  in  several 
forms or modes. The DBMS must also be able to process data in all modes
available  to  the  host  computer.  Restrictions  on  the  size,  quantity,  or 
organization  of  the  data  frequently  make  conversion  of  existing  data 
bases  difficult,  if  not  impossible. 

Variable-length records are used extensively to improve storage
efficiency and flexibility. This advantage should be offered by the DBMS
vendor as a basic part of the package features (19:3-7). Some vendors
have provided an alternative approach to variable-length records which
permits the variable portion to be defined as subsidiary records. While
this approach will solve the data storage requirements in most cases, it
is not adequate to meet the original needs of the user.

When the DBMS vendor places restrictions on the size and quantity
of data elements and records within the data base, he places an added
burden on the user. The user must be constantly conscious of this
restriction when designing a data base. The user must also be careful to
allow room for future growth of the records (n). If his crystal ball is
inaccurate, he may find himself having to completely restructure the
original design of the data base after millions of records have been
stored.




The accepted method of relating similar data records in sequential
data files is through the data itself. Key fields containing live data
determine the relationship and position of a record within the file.
This arrangement has one major disadvantage. The user is unable to
modify the contents of the key field without major structural modification
of the data base contents. Unfortunately some DBMS implementations
also rely on the contents of data fields to provide the relational linkage
between records. Repeating the key field between members and owners is
very wasteful of storage space and prevents modifying the key field. The
impact of this technique becomes clearer when multiple set structures
are considered. The records, which are owners or members of more than
one set, can reach the point where modification of many data elements is
prohibited because of their use as key fields.

It is important that the DBMS allow key fields to be modified as
necessary. When a key field is modified, the record in which it resides
must be relocated logically in the sets in which it participates. The
redundant key field technique described above is very wasteful.
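
The difference between linkage by repeated key values and linkage by pointer
can be shown with a short sketch. The class names and sample data below are
hypothetical, and the ordered list merely stands in for whatever chain or
pointer array a real DBMS would maintain.

```python
import bisect

class Member:
    def __init__(self, name, sort_key):
        self.name = name
        self.sort_key = sort_key      # the member does NOT repeat the owner's key

class Owner:
    """Owner record of one set; members are linked by reference, not by value."""
    def __init__(self, key):
        self.key = key
        self.members = []             # kept in ascending sort_key order

    def connect(self, member):
        keys = [m.sort_key for m in self.members]
        self.members.insert(bisect.bisect_right(keys, member.sort_key), member)

    def modify_member_key(self, member, new_key):
        # Relocate the member logically within the set after its key changes.
        self.members.remove(member)
        member.sort_key = new_key
        self.connect(member)

dept = Owner("D10")
for name, key in [("Adams", 30), ("Baker", 10), ("Cole", 20)]:
    dept.connect(Member(name, key))

dept.key = "D99"                              # owner key changes; no member touched
dept.modify_member_key(dept.members[0], 25)   # Baker's key changes from 10 to 25

print(dept.key, [(m.name, m.sort_key) for m in dept.members])
```

Because no member carries a copy of the owner's key, changing that key is a
single update. Changing a member's own sort key costs one disconnect and one
reconnect within the set.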

One of the major features of a good DBMS is effective and efficient
utilization of data storage space. Data redundancy reduction or outright
elimination is a mandatory feature when selecting a DBMS (36:3?).

Several structural approaches are used to associate data within
the data base. The most popular is chaining; the second, inverted lists.
Different update and retrieval capabilities are present for each method
(31:155-162,33:189). (See Chapter 3.)

Indexing of data normally utilizes lists or chains with forward
pointers looking to the next record in the index. It is very useful to
be able to optionally point to the previous record in the index (a trade-off
of improved performance versus reduced storage space). This permits
scanning sorted lists in reverse order or backing up one
record without having to walk the chain or list all the way around to the
previous record.
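
The cost of backing up one record with and without a prior pointer can be
sketched as follows. The circular chain and the record keys are illustrative
assumptions, not the layout of any specific DBMS.

```python
class Rec:
    def __init__(self, key):
        self.key = key
        self.next = None    # forward pointer, always present
        self.prior = None   # optional backward pointer (costs extra storage)

def build_ring(keys):
    """Link records into a circular chain carrying both pointer types."""
    recs = [Rec(k) for k in keys]
    for i, r in enumerate(recs):
        r.next = recs[(i + 1) % len(recs)]
        r.prior = recs[i - 1]
    return recs

def previous_via_next(rec):
    """Find the prior record using forward pointers only.

    Returns the record found and the number of pointers followed.
    """
    cur, hops = rec, 0
    while cur.next is not rec:
        cur = cur.next
        hops += 1
    return cur, hops

ring = build_ring([10, 20, 30, 40, 50])
target = ring[3]                                  # the record with key 40

walked, hops = previous_via_next(target)
print("forward pointers only:", walked.key, "reached after", hops, "hops")
print("prior pointer:", target.prior.key, "reached after 1 hop")
```

With forward pointers alone the number of hops grows with the length of the
chain, whereas the prior pointer always costs one access at the price of one
additional pointer per record.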

As part of the efficient management of storage space the DBMS
must be able to compact data, removing unnecessary blanks and repetitive
characters. The reduction in data storage, depending on the format of
data, can often exceed 50 per cent. The cost of using data compaction
techniques, however, must be considered. The number of records compacted
may depend on how much processing time is required to perform the compaction
(and its opposite, expansion) process (36:1433-31).
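
A minimal sketch of one compaction technique, run-length suppression of
repeated characters, is given below. It assumes that the marker character
never occurs in the stored data; the record contents are invented for the
example, and real packages may use entirely different schemes.

```python
def compact(text, marker="\x00", min_run=4):
    """Replace each long run of one character by marker + count + character."""
    out, i = [], 0
    while i < len(text):
        ch, run = text[i], 1
        while i + run < len(text) and text[i + run] == ch and run < 255:
            run += 1
        if run >= min_run:
            out.append(marker + chr(run) + ch)   # three characters for the run
        else:
            out.append(ch * run)                 # short runs are left alone
        i += run
    return "".join(out)

def expand(packed, marker="\x00"):
    """Reverse of compact(); this is the overhead paid on every retrieval."""
    out, i = [], 0
    while i < len(packed):
        if packed[i] == marker:
            out.append(packed[i + 2] * ord(packed[i + 1]))
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return "".join(out)

# Hypothetical fixed-format record padded with blanks and leading zeros.
record = "SMITH".ljust(30) + "000000125" + "AUSTIN".ljust(20)
packed = compact(record)

print(len(record), "characters stored as", len(packed))
```

Every store must pass through compact and every retrieval through expand, so
the processing time of the pair, measured on representative records, is the
overhead to be weighed against the storage saved.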


CODASYL Relationship

Does the DBMS conform to the data structures recommended by the
CODASYL DBTG?

Does the syntax of the DBMS's DDL conform to CODASYL specifications?

Are any limitations placed on the extent of adherence to CODASYL
specifications? If so, define in detail.

Are all eight CODASYL data relationships supported? If not,
which ones are unsupported?

Does the DBMS support the CODASYL subschema technique?


The CODASYL Data Base Task Group (DBTG) specifications have become
increasingly accepted by the data processing industry as the standard to
be followed for DBMS development. The potential user needs to consider
the impact of such standardization prior to purchasing a particular DBMS
package. The packages which do not conform to the CODASYL specification
will ultimately have to do so to survive. At the time the vendor changes
to conform, users will face a painful change to syntax and procedures or
be forced to continue using an increasingly obsolete version of the older
package.


The Data Definition Language (DDL) defined by CODASYL is very
similar to COBOL in syntax. It is very flexible and easy to use. The
options defined in the DDL syntax cover such a range of data definition
that most DBMS vendors have implemented only a small portion of those
available (17:65-11:6).

Few DBMS vendors have implemented more than half of the CODASYL
specifications for DDL and Data Manipulation Languages (DML). It is
important to evaluate the extent that DDL and DML features have been
implemented to determine if the system can provide the tools necessary
for the installation's environment (Ui). Partial use of CODASYL combined
with non-CODASYL approaches must be carefully reviewed to determine if
future compatibility will be affected (I?).

CODASYL defines eight data structure relationships. Seven
structures were defined in the original definitions (33:i??-20r). The
DBTG added the eighth in December Lc7r. The newest structure allows a
record to be related to itself.

The subschema technique is a major feature of the CODASYL specification.
The subschema operates as a filter between the application program
and the data base, a "window" into the data base. As such, the subschema
allows the application program to view a portion of the data base which
it must access to perform its functions. The program, and its user, is
not allowed access to the rest of the data base. Indeed, the user does
not know any other data base exists.
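
The "window" idea can be sketched in a few lines. The code below does not use
CODASYL DDL; the record types, element names, and the Subschema class are
hypothetical stand-ins meant only to show how a filter can hide both record
types and individual data elements from a program.

```python
# Hypothetical data base contents, keyed by record type.
DATA_BASE = {
    "EMPLOYEE": [
        {"emp_no": 101, "name": "Adams", "dept": "D10", "salary": 14200},
        {"emp_no": 102, "name": "Baker", "dept": "D20", "salary": 16800},
    ],
    "PAYROLL-AUDIT": [{"emp_no": 101, "note": "reviewed"}],
}

class Subschema:
    """A filter naming the record types and data elements a program may see."""
    def __init__(self, visible):
        self.visible = visible            # {record type: [element names]}

    def find(self, record_type):
        if record_type not in self.visible:
            # A hidden record type looks exactly like a nonexistent one.
            raise KeyError("unknown record type: " + record_type)
        names = self.visible[record_type]
        return [{n: rec[n] for n in names} for rec in DATA_BASE[record_type]]

# Subschema for a hypothetical telephone-directory program.
directory_view = Subschema({"EMPLOYEE": ["emp_no", "name", "dept"]})

print(directory_view.find("EMPLOYEE"))
```

The directory program receives no salary element and cannot distinguish the
PAYROLL-AUDIT record type from one that was never defined, which is the
behavior the text describes.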


The DBMS provides one of the best opportunities to integrate data
and reduce processing costs that has been available to the ADP community
since its inception. It also presents an unprecedented opportunity for
compromise of that data. Separate files can be penetrated one by one.
But without access to a complete set of files, the whole picture of the
data cannot be obtained and compromise is seldom complete. The integrated
data base removes these obstacles.


Internal  Security 

Does the DBMS implement passwords to control access to the data
base?

At what level (i.e., data set, set, record, data element) can
passwords be employed?

Can selected data within records be encrypted? If so, what is
the processing overhead associated with encryption?

Does the DBMS implement passwords to control access to the data
dictionary?

Can the data base be structured in such a way that certain data
base record types are assigned to a physical device?

Passwords are the most common and easily implemented means of
securing data. Unfortunately passwords by themselves are easily compromised.
A scheme of multiple passwords and additional checks is necessary
to achieve an acceptable protection level. The DBMS must possess some
means of password protection of its data. The optimum arrangement is to
permit password definition at all levels (i.e., data set, set, record,
data element) (17:103). Password protection to the
record level is fairly common among DBMS implementations. Data element
[...] storage space.

Another security method is encrypting the data within the data
base. Encrypting generates significant processing overhead. Unless
selected data elements can be encrypted instead of the whole record or
file, inefficient processing will result.

Several DBMS implementations include a data dictionary as part of
the package. This dictionary is the most sensitive file in the DBMS data
base. Access to the data dictionary provides a penetrator with complete
descriptions and structures of the entire data base. With such a road
map, compromise of the data base becomes relatively easy. Extensive
password protection is needed to secure the data dictionary (17:75-7?).


Each DML verb (store, modify, delete, etc.) should be individually
protected, to restrict application programs to their functions (i 7 • op ~ ).
This feature is an integral part of the CODASYL subschema and can be
readily implemented. A physical security constraint can be imposed if
data can be organized so that all record occurrences of a certain type or
sensitivity level are located on one device, such as a disk pack. This
facilitates removal of the pack when access to the sensitive data is not
required (36:31). This approach is the least expensive but reduces
flexibility and slows response to data requests.
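
Protection of individual DML verbs amounts to a table lookup made before each
request is honored. The following sketch assumes a privilege table keyed by
program and record type; the program names, record types, and the check
function are hypothetical.

```python
class AccessDenied(Exception):
    """Raised when a program issues a verb it has not been granted."""

# Hypothetical privilege table: (program, record type) -> permitted DML verbs.
PRIVILEGES = {
    ("ORDER-ENTRY", "ORDER"): {"FIND", "GET", "STORE"},
    ("ORDER-AUDIT", "ORDER"): {"FIND", "GET"},
}

def check(program, verb, record_type):
    """Allow the request only if this verb was granted to this program."""
    allowed = PRIVILEGES.get((program, record_type), set())
    if verb not in allowed:
        raise AccessDenied(program + " may not " + verb + " " + record_type)
    return True

print(check("ORDER-ENTRY", "STORE", "ORDER"))

try:
    check("ORDER-AUDIT", "DELETE", "ORDER")
except AccessDenied as err:
    print("refused:", err)
```

A program absent from the table receives an empty set of verbs, so the default
is refusal rather than access.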


Data  Set  Security 


Can data sets be restricted to certain users?

Can data sets be physically separated from the data base and
normal operation of the DBMS continue?

Can the data sets be secured to prevent "dumps" of their contents?

Data sets are the operating system's reference to the data base,
identifying where and how much space is set aside for data. It is essential
that the DBMS provide optimum flexibility and security to data set
selection and assignment. The DBMS must be able to selectively define
certain data sets as belonging to specific users and containing only that
user's data (1?: 1-11,11, 113).

It is a desirable feature of the DBMS that physical data sets
(disk packs, etc.) be removable from the active data base without affecting
the normal operation of the DBMS (27). This facilitates physical
separation and security of very sensitive data (19).

It  is  generally  acknowledged  that  data  base  security  halts  at  the 
machine  room  door.  Operators  and  programmers  can  readily  print  the 
contents  of  data  sets  using  standard  utility  programs.  It  is  useful  for 
the  data  base  administrator  (DBA)  to  have  the  ability  to  prevent  such 
printing  or  to  make  the  output  meaningless  (encrypting). 


Functional  Security 

How does the system prevent one user from accessing data belonging
to another user?

Can a user gain control over the DBMS system buffers through
manipulation of his own data area addresses?

Is the subschema approach defined by CODASYL utilized fully?

An integrated data base contains data which "belongs" to many
users. Typically each user is concerned that access to "his" data by
other users is restricted. This restriction requirement may vary from
minor to total. The DBMS must be capable of enforcing such variable
restrictions and varying the level of restriction from user to user (19:1.).
The manner in which the restriction capability is implemented must be
carefully reviewed. Improperly handled, this feature can have serious
impacts on program logic (15).


Most DBMS implementations utilize core buffers in which blocks
(pages) of data are stored while being used by application programs. Each
page commonly contains data from more than one user, and the combination
[...] access to these buffers and read data beyond that authorized. DBMS
implementations must guard against this invasion when permitting multiple
users' data to coexist in the data base (17,1?).

The subschema approach defined by CODASYL provides a selective
window into the data base which may be tailored to each individual
application's requirements. The subschema may be very sophisticated and
provide a large measure of protection while enhancing the independence of the
application program from the data base (17*353-196).


Integrity


Does  the  system  provide  the  means  to  validate  raw  data  element 
occurrences  prior  to  the  storage  of  the  data  into  the  data 
base? 

Can  entry  of  data  into  the  data  base  be  automatically  restricted 
to  prevent  conflicting  data  from  being  stored? 

Does the DBMS permit concurrent update of individual data base
records by multiple users?

How  is  "deadly  embrace"  avoided? 

Can  access  to  areas  of  the  data  base  be  restricted  to  control 
the  level  of  multiple  accesses  to  any  one  portion  of  the 
data  base? 

How  is  file  lockout  between  contending  users  prevented  by  the 
system? 

The CODASYL specification provides the facility to validate raw
data before storing the data in the data base (17 :?[-??). This feature
has not been widely implemented but has the potential to increase the
DBA's control over the data base, preventing insertion of erroneous data.
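
Centrally declared validation can be pictured as a table of rules consulted on
every store. The sketch below is not CODASYL syntax; the element names, the
rules, and the store function are assumptions made for illustration.

```python
# Hypothetical validation rules declared once by the DBA for each data element.
RULES = {
    "qty":     lambda v: isinstance(v, int) and 0 <= v <= 99999,
    "part_no": lambda v: isinstance(v, str) and len(v) == 6 and v[0].isalpha(),
    "unit":    lambda v: v in {"EA", "BX", "LB"},
}

def validate(record):
    """Return the names of elements that fail their rule (empty list = valid)."""
    return [name for name, value in record.items()
            if name in RULES and not RULES[name](value)]

def store(data_base, record):
    """Accept the record only if every governed element passes its rule."""
    errors = validate(record)
    if errors:
        raise ValueError("rejected, invalid elements: " + ", ".join(errors))
    data_base.append(record)

parts = []
store(parts, {"part_no": "A-1001", "qty": 12, "unit": "EA"})

try:
    store(parts, {"part_no": "1-1001", "qty": -5, "unit": "EA"})
except ValueError as err:
    print(err)

print(len(parts), "record stored")
```

Because the rules live with the data base rather than in each application
program, a change made by the DBA takes effect for every program at once.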

It is very desirable to be able to control the insertion of new
data into the data base so that it is consistent with existing data already
stored. This feature can be programmed by the using run-unit. This
approach, however, is beyond the immediate control of the DBA. Better
control can be achieved if the DBA has direct control over the insertion
function (17 :9ii-?7).

Concurrent update occurs when two independent users of the data
base simultaneously attempt to update a specific record occurrence. This
condition, while not common in actual practice, can destroy the integrity
of a data base in a short time. A DBMS must have the ability to
overtly prevent such contention conditions from occurring. Deadly embrace,
the fear of all systems people who deal in multiple users of a single data
file, occurs when two users attach resources serially as they require
them. Overlapping requests can render both users inactive without recovery.
The method employed by DBMS's to avoid or recover from this condition is
important to the ability of a DBMS to operate without significant operating
problems (20,27).
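
One way a system can recognize a deadly embrace is to record which user each
blocked user is waiting for and to look for a cycle. The sketch below assumes
that a blocked user waits on exactly one other user at a time; the user names
are hypothetical, and an actual DBMS may instead prevent the condition, for
example by requiring resources to be claimed in a fixed order.

```python
def find_deadlock(waits_for):
    """Return the users forming a cycle in the wait-for table, or None.

    waits_for maps each blocked user to the user holding the resource
    that the blocked user has requested.
    """
    for start in waits_for:
        seen, cur = [], start
        while cur in waits_for and cur not in seen:
            seen.append(cur)
            cur = waits_for[cur]
        if cur in seen:                       # walked back onto our own path
            return seen[seen.index(cur):]
    return None

# User A holds record 1 and requests record 2; user B holds 2 and requests 1.
print(find_deadlock({"A": "B", "B": "A"}))

# User C merely waits for B, who is running normally: no embrace.
print(find_deadlock({"C": "B"}))
```

Once a cycle is reported, the system can break it by terminating one of the
named users and removing the effects of that user's partial updates.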

Contention and deadly embrace can be avoided by distributing the
data over several areas of the data base. User access can be controlled
more effectively, and the number of users accessing any one area is reduced.
File lockout occurs during periods of contention by multiple users.
When one is updating, others must be prevented from also updating. It
may, however, be desirable to permit some users to retrieve while others
are updating, particularly in on-line environments. The DBMS must be able
to selectively control lockout to assure maximum effective use of the data
base while maintaining maximum integrity.
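
Selective lockout can be sketched as a lock table that treats update and
retrieval requests differently. The class below is a simplified assumption: it
refuses a request rather than queueing it, and the option name
read_during_update is invented for the example.

```python
class LockTable:
    """Grant or refuse locks per record; no waiting, to keep the sketch short."""
    def __init__(self):
        self.readers = {}   # record id -> set of users retrieving
        self.updater = {}   # record id -> the one user updating

    def request(self, user, record_id, mode, read_during_update=True):
        holder = self.updater.get(record_id)
        if mode == "update":
            if holder not in (None, user):
                return False                   # someone else is updating
            self.updater[record_id] = user
            return True
        # Any other mode is treated as retrieval.
        if holder not in (None, user) and not read_during_update:
            return False                       # strict lockout was requested
        self.readers.setdefault(record_id, set()).add(user)
        return True

    def release(self, user, record_id):
        if self.updater.get(record_id) == user:
            del self.updater[record_id]
        self.readers.get(record_id, set()).discard(user)

locks = LockTable()
print(locks.request("A", 7, "update"))                               # granted
print(locks.request("B", 7, "update"))                               # refused
print(locks.request("B", 7, "retrieve"))                             # granted
print(locks.request("C", 7, "retrieve", read_during_update=False))   # refused
```

Whether retrieval during update is acceptable depends on the application: an
inquiry terminal may tolerate a slightly stale record, while a program that
bases its own update on what it reads may not.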


Recovery 


Does the DBMS provide a method for taking regular checkpoints of
its operation?

Are checkpoints automatic, initiated by operators, initiated by
programs, or a combination of the above?

Do checkpoints include core images of the programs using the DBMS?

How rapidly can the DBMS be restarted from a checkpoint?

Does the DBMS provide facilities for physical backup of data bases
on tape or disk?

Does the DBMS automatically recover from the failure of an application
program using the data base when operating in batch and
on-line modes? Must operator intervention be used to assure
recovery?

Can the DBMS automatically recover from hardware failure associated
with the devices and/or channels on which the data base resides?

When total hardware failure (CPU halt or power failure) occurs, can
the DBMS be restarted "warm" or must a "cold" restart be
performed?

When restarting "cold," how much time is required before normal
operation may resume?

When software or hardware failure causes the DBMS to fail, what
damage to the data base is likely?

Can the data base damage be recovered through use of standard
vendor-supplied utilities?


Data base recovery is one of the critical aspects of any DBMS.
Without the ability to recover following abnormal situations, a data base
is useless (12).


Checkpoints are the classical method of recovering from malfunctions
during long processing runs. When a DBMS operates in a multi-tasking
environment, regular recovery points are necessary. These checkpoints
should be frequent enough that recovery can be accomplished quickly with
minimum disruption of service to users (n).


Checkpoints should be automatically taken to assure continuity.
It is useful, but not essential, that operations personnel be able to
initiate additional checkpoints.


Classical checkpoint techniques have included core images of the
program executing. While useful to recovery of a batch program, this
feature becomes an overhead factor in on-line multi-programming
environments. If included, the core image feature should be optional.


Restart of the DBMS from a checkpoint must be accomplished rapidly.
In an on-line environment, speed is essential and may be very critical to
the users supported by the DBMS.


Associated with checkpoints is the use of a physical backup of
the data base on tape or disk. These "saves" of the data base are
typically used as the points where reconstruction of a damaged data base
begins. It must be possible to save a data base at any point where its
integrity is known to be intact, typically at a quiescent or idle point
in processing (17).

When an application program updating the data base abnormally
terminates, the data base is left partially changed. Left alone, this
condition damages the integrity of the data base. It is important to
remove the effects of the program by returning the data base to the
condition prior to the program's start. Such a feature must be automatic when
operating in an on-line or multi-tasking mode. Operator intervention
should not be required except in a single-task batch mode. (n)
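
One common way to remove the effects of a failed program is to record the
old value of each item before it is changed and to restore those values in
reverse order on failure. The Python sketch below shows that idea under
stated assumptions: the names UpdateRun, back_out, and run_program are
hypothetical, and a dictionary stands in for the data base.

```python
_MISSING = object()  # marks a key that did not exist before the program ran


class UpdateRun:
    """Hypothetical sketch: undo a program's updates if it terminates abnormally."""

    def __init__(self, data_base):
        self.data_base = data_base
        self.before_images = []   # (key, old value) in the order changes were made

    def write(self, key, value):
        """Record the old value, then apply the change."""
        self.before_images.append((key, self.data_base.get(key, _MISSING)))
        self.data_base[key] = value

    def back_out(self):
        """Restore old values in reverse order so the earliest one is applied last."""
        for key, old in reversed(self.before_images):
            if old is _MISSING:
                self.data_base.pop(key, None)
            else:
                self.data_base[key] = old
        self.before_images.clear()


def run_program(data_base, program):
    """Run `program(run)`; back out its changes automatically on any failure."""
    run = UpdateRun(data_base)
    try:
        program(run)
    except Exception:
        run.back_out()
        return False
    return True
```

No operator action is involved: the back-out happens as part of handling the
failure itself.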

Hardware  failure  is  the  weak  spot  of  most  implementations. 

Few  systems  can  recover  automatically  from  failures  associated  with  the 
devices  and/or  channels  which  serve  the  data  base.  It  is  essential  that  a 
smooth  recovery  be  allowed  by  the  system  without  reloading  the  data  base 
if  it  has  not  been  damaged. 


When total hardware failure occurs, execution of the DBMS halts at
an abnormal location if it was executing at the time of the failure. This
halt may prevent the DBMS from being restarted upon recovery of the
hardware. It is important to on-line users that the system begin operation as
soon as possible. A "warm" restart permits the system to continue where
it halted with little or no impact on its operation. A "cold" restart
often requires a complete reloading of the data base and restarting of
operations from a previous checkpoint. The time difference between warm
and cold starts can be significant, and a warm restart capability is very
desirable.




18:3  performance . 

Part of the price a user pays for a DBMS package goes toward the
utilities needed to provide auxiliary support. Data base recovery
utilities are essential to operation of the DBMS. Without them the
system cannot be maintained, and the user is forced to write his own
recovery capability. Few users have the extra resources available to
write such programs. Depending on the complexity of the DBMS file
structure, the task may be beyond the user's abilities. Standard
vendor-supplied recovery utilities, therefore, are a mandatory feature of the
DBMS package.


EVALUATION OF DBMS PROGRAM


' rn^'OT?  £ ' 


A DBMS is a powerful tool to increase the scope of ADP processing.
It is useless, however, unless it can be properly used. The ease with
which programmers, analysts, and end users can utilize the features of
the DBMS will bear heavily in its evaluation.


Languages  Supported 


What programming languages may be used in conjunction with the
DBMS?

How are the interfaces effected?

When a language preprocessor is used, does the preprocessor perform
syntax and logical validation of the Data Manipulation Language
(DML) statements?

Does the COBOL language interface comply with CODASYL DML
specifications?


While most data base system publicity is aimed at the business
user, scientific and statistical data is often part of an integrated data
base. It is important that all users and potential users of the DBMS have
reasonable access to the data base through the language which best supports
their application. (6)

Interface with the DBMS may be accomplished through several
approaches. Nearly all of the systems implemented currently provide the
ability to access the DBMS software through the CALL statement in those
languages which implement calls. Several implementations also provide
direct DML statement usage which accesses the DBMS either directly or by
converting the DML verbs to call statements. (?1 :3C : -l" ,169-7 C)


The use of a preprocessor to convert DML commands to call statements
is becoming more common. The services provided by the preprocessors
range from a straight one-for-one conversion of DML commands to calls to
extensive syntactical and logical validation of the source program code.
The greater the level of program verification performed by the
preprocessor, the more effective and accurate the program code generated.
( 7' c\ I 3 - )
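
The range of preprocessor services described above can be pictured with a
small Python sketch that converts one-line DML commands to call statements
and rejects commands naming records it does not know. The verb list is an
illustrative subset, and the generated form CALL 'DBMS' USING ... is a
hypothetical call convention chosen for the example; actual packages define
their own.

```python
KNOWN_VERBS = {"FIND", "GET", "STORE", "MODIFY", "DELETE"}  # illustrative subset


def preprocess(source_lines, record_names):
    """Translate simple 'VERB record-name.' lines into CALL statements.

    Lines that do not start with a known verb pass through unchanged.
    Returns (output_lines, errors); errors lists (line_number, message).
    """
    output, errors = [], []
    for number, line in enumerate(source_lines, start=1):
        words = line.strip().rstrip(".").split()
        if not words or words[0].upper() not in KNOWN_VERBS:
            output.append(line)          # ordinary host-language statement
            continue
        verb = words[0].upper()
        if len(words) != 2:              # syntactical validation
            errors.append((number, f"{verb} needs exactly one record name"))
            continue
        record = words[1].upper()
        if record not in record_names:   # logical validation against known records
            errors.append((number, f"unknown record {record}"))
            continue
        output.append(f"CALL 'DBMS' USING '{verb}', {record}.")
    return output, errors
```

Dropping the two validation checks would leave the straight one-for-one
conversion; keeping them catches errors before the program is compiled.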

The CODASYL DML structure has been proposed as an addition to the
American National Standards Institute (ANSI) COBOL language upgrade. It
is likely that the ANSI standards committee will approve the proposed DML
verbs or a very similar structure. (17:201-6?) As discussed earlier, the
adoption of the CODASYL standard can have a positive effect on those users
of CODASYL-compatible systems and a negative effect on those whose DBMS
does not conform.


Programmer Interface

What level of programmer experience is required to effectively
use the DBMS?

Does the DBMS perform support functions which relieve the
programmer from actions which are repetitive and/or error-prone?

Does the system permit definition of data elements using symbolic
names? If so, what limits are placed on the structure of
the names?

Few ADP installations have the luxury of a totally-experienced
staff. There are always a few junior-level personnel on the staff. It
is important that all programmers on the staff be able to effectively use
the DBMS. (77) Without that ability, the use of DBMS will be limited to
the few senior programmers whose skills can master the DBMS. (19:3-6)

Several DBMS systems provide support features which relieve the

— *>• g **v-ig **  0**  i ^ O’J. S ? ^ ^ T* 0"*^ “ TM' 0 ge  G2"t  ^ v«g  c 


formatting, call-statement generation, and communications area coding.


The greater the support provided by the DBMS, the less effort necessary by
the programmer. The effort reduction increases productivity through less
program coding and fewer coding errors. Significant coding convention and
data name standardization features further enhance the programmer
interface on some systems.


Meaningful symbolic representation of the data elements in the
data base improves the readability and understanding of the data base
structure. The number of characters permitted in symbolic names varies
widely among available implementations, ranging from a low of 2 characters
to a high of 16. If installation conventions require descriptive data
names, less than 12 characters will present severe constraints upon naming
within the data base structure.
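
An installation can test its own naming conventions against a candidate
system's limit before committing to it. The short Python sketch below does
this; the function name and the sample data names are hypothetical and serve
only to illustrate the check.

```python
def names_too_long(names, max_length):
    """Return the data names that exceed a DBMS's symbolic-name length limit."""
    return [name for name in names if len(name) > max_length]


# Hypothetical data names taken from an installation's naming standard.
installation_names = ["CUSTOMER-NAME", "ORDER-DATE", "PART-NO"]

# Names that would have to be shortened under a 12-character limit.
rejected = names_too_long(installation_names, 12)
```

Running the same list against each candidate's limit shows how much renaming
each system would force on the installation.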


Training


Does the vendor provide adequate training for all levels of
programming staffs?

Is training a reinforcement of manuals and guides, or is formal
training necessary to effectively use the system?

Is  training  available  on  video  tape? 

Does  the  vendor  limit  the  number  of  participants  in  training 
classes? 

How  much  training  is  included  in  the  contract  price? 

Is  training  available  locally? 

Training of programming staffs is a continuous expense to all ADP
installations. (37) The addition of a DBMS package to the software library
requires that most, if not all, of the programmers and analysts on the
staff become familiar with the capabilities, technical features, and use
techniques of the DBMS. The type and depth of training differs from level
to level within the staff. Application programmers, junior through senior,
must know how to write interacting programs which will use the DBMS file
features. Systems programmers must understand the interaction of the
package with the operating system and other software packages. The systems
analyst needs to know how to apply the capabilities of the DBMS to the
application needs of the installation's users. (13)

Formal classroom training is often used to expand the material
provided in the manuals supplied by the vendor. While this technique
assures a rounded education on the system, it leaves the student unable to
refer to the manual for that information once the instructor has gone. It is
important that the vendor's documentation be as complete as possible and permit
each level of user to effectively utilize the system without consulting
continuously with the vendor.

Repeat training is a costly and disruptive necessity. As new staff
members are hired, it is necessary to train them in the various software
packages being used in the installation. The DBMS, when integrated into
the installation's operation, must be understood by all staff members.
Training courses available on video tape reduce the cost and improve the
timeliness of instruction. While not as personal as live instruction,
video courses may be run at will and can be repeated as often as desired.
Programmers previously trained may replay certain tapes to refresh or
reinforce their knowledge.

Vendors often limit the number of persons who may attend a
specific training course. This is logical when the course is offered to
the general public or when close individual support is required. However,
when a course is offered in-house, the vendor should be flexible to the
needs of the user. The costs of training can be reduced significantly
if a larger number of students are permitted in classes which are primarily
lecture.


Most DBMS packages include some training in the contract price of
the system. The amount of training varies widely. The quantity of training
provided is seldom adequate to fully prepare the installation for full
use. Supplemental costs, with a few systems, may approach the license
price of the DBMS itself.

The availability of local training is a very important feature.
The cost of out-of-town training extends beyond dollars alone. The time
required for travel and the loss of the trainee's services during the
course period can have severe impacts on installation effectiveness. Few
installations can afford the disruption caused by extensive use of remote
training.


charter  uC 
Data base management systems are among the most exciting developments
on the computer scene today. Properly guided and controlled, they hold the
potential of expanding the scope of ADP applications far beyond its
current boundaries.

The pitfalls facing the potential user of DBMS are enormous and
costly. It is essential that careful evaluation and comparison of
available packages be made. As part of this evaluation, a careful meshing of
DBMS features and installation requirements is necessary to assure effective
use of a DBMS.

It is possible that the results of an evaluation will indicate that
the use of a DBMS is not practical. When this occurs, procurement of a
DBMS will be on shaky ground and potential users must proceed at some
risk.

This paper has provided many questions for potential DBMS users to
ask of vendors. Often the answers will generate more questions. When
fully explored, these answers will provide the basis for a decision to buy
or not to buy.

A final comment: "Let the buyer beware" is particularly applicable
to buying a DBMS. The wrong decision will have lasting long-term effects
on an organization and its ADP program.
This glossary includes terms common to the DBMS technology. Only
a few of the most dominant or confusing have been included from the many
used when discussing data base systems. A majority of the definitions
have been developed using references 17, 19, 33, and 3o.

Area (realm): A named sub-division of the addressable storage
space within the data base to which records may be assigned independent
of set membership. A logical piece of the data base.

CALC: One of the three record storage modes, based on the
computation of a data base storage location using values supplied by data within
the record. Used for direct-access records.

Chain pointers: A chain of pointers which can be followed from
record to record and provide for sequential access to all records in the
set occurrence. A storage mode for sets where data is stored with linked
organization for serial access.

Data-aggregate: A named collection of data-items within a record
and referred to as a whole.

Data base: All physical record occurrences, set occurrences, and
areas defined by a specific schema.

Data-item: The smallest unit of named data within a data base.
An occurrence of a data-item is a representation of a value and may consist
of any number of bits or bytes.

Data set: A named collection of physical records, including
data used for locating the records (indices).

DDL: Data Definition Language. The language used for describing
a data base or that part of a data base known to a program. Defined in
terms of names and characteristics of data-items, data-aggregates, records,
areas, and sets, and the relationships existing between occurrences of
those elements in the data base. A language for the logical description
of the data base.


Direct: One of the three record storage modes where a unique
identifier, supplied by the using program, identifies the location in the
data base of a record occurrence.

DMCL: Device-Media Control Language. The language used to
describe the relationship between the logical schema and the physical
storage space used to store data base records.

DML: Data Manipulation Language. The language used by the
programmer to communicate with the data base system.

First:  One  of  five  set  orders  in  which  the  new  member  record  is 
inserted  as  the  immediate  successor  to  the  owner  record  occurrence. 

Integrity: Safeguards against occasional failures and accidents
which can occur during processing of a data base. The safeguarding of
data from undesired interaction of programs against the data base. The
checking of the value of data to be stored in the data base to assure its
consistency with data already present in the data base.

Last: A set order where the new member record is inserted as the
immediate predecessor to the owner record occurrence.

Mandatory: A set membership condition which indicates that once
the membership of a record occurrence in a set is established, the
membership is permanent.

Member (child): A record within a set subsidiary to and dependent
on an owner record.

Network, data base: The most general form of data structure. In
a network any given element may be related to any other. A data structure
in which an n-to-n relationship is permitted between elements.

Next:  A set  order  where  the  new  member  record  is  inserted  after 
another  record  occurrence  which  was  the  latest  record  within  the  set  to 
be  accessed. 

Owner  (parent) : A record  whose  existence  establishes  the  existence 
of  a set  occurrence.  The  elementary  record  of  a set. 

Pointer  array:  Sets  organized  through  a list  of  member  record 
occurrences  stored  with  the  owner  record. 

Prior: A set order where the new member record is inserted before
another record occurrence which was the latest record within the set to
be accessed.

Privacy: Protection against unauthorized access of the data.
Refers to the rights of individuals and organizations to determine for
themselves when, how, and to what extent information is to be transmitted
to others.

Schema: Consists of data definition (DDL) entries and is a complete
description of a data base. It includes the names and descriptions of all
of the areas, set occurrences, record occurrences, and associated data
items as they exist in the data base.

Security: Protection of data against accidental or intentional
disclosure to unauthorized persons, or unauthorized modification or
destruction of a data base.

Segment: Data-aggregate and/or record which contains one or more
data-items and is the basic unit of data which passes to and from the
application programs under control of DBMS software.


Set: A named collection of record types. As such, a set
establishes the characteristics of an unlimited number of occurrences of
the set. The basic structure of the CODASYL language specification.

Subschema: Consists of data definition (DDL) entries. It need
not describe the entire data base but only those areas, sets, and records
which are to be known to a specific program or programs. An application
programmer's view of the data base.

Via: One of three storage modes based on the location of the
owner occurrence of the set.
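
The set orders First, Last, Next, and Prior defined in this glossary can be
pictured with a short Python sketch. It treats the members of one set
occurrence as an ordered list following the owner; the function name
insert_member and the use of a list index for the most recently accessed
member are assumptions made for illustration, and the fifth set order is
not shown.

```python
def insert_member(members, new_member, order, current=None):
    """Insert `new_member` into a set occurrence's member list.

    `members` is the ordered list of member records following the owner;
    `current` is the index of the member most recently accessed and is
    required only for the NEXT and PRIOR orders.
    """
    if order in ("NEXT", "PRIOR") and current is None:
        raise ValueError(f"{order} requires a current member")
    if order == "FIRST":        # immediate successor of the owner
        position = 0
    elif order == "LAST":       # immediate predecessor of the owner
        position = len(members)
    elif order == "NEXT":       # directly after the current member
        position = current + 1
    elif order == "PRIOR":      # directly before the current member
        position = current
    else:
        raise ValueError(f"unsupported set order: {order}")
    members.insert(position, new_member)
    return position
```

Because the chain of a set occurrence returns to its owner, the last member
is the owner's immediate predecessor, which is why Last places the new
record at the end of the list.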
Montreal, Quebec, Canada
Mr. Martin Frobisher
Corporate Data Base Administrator

17. Southern Railway System
Atlanta,  Georgia 
Dr.  Bill  Linn 
System  Director 

15.  U.  S.  Government 
Department  of  Labor 
Bureau  of  Labor  Statistics 
Washington,  D.  C. 

Dr.  Lester  Sachs 


depart. ment  i 
Washington, 
Mr.  Ken  Kin; 


